Asymptotic and finite-sample properties of estimators based on stochastic gradients

Authors

  • Panos Toulis
  • Edoardo M. Airoldi
  • Joe Blitzstein
  • Leon Bottou
  • Bob Carpenter
  • David Dunson
  • Andrew Gelman
  • Brian Kulis
  • Xiao-Li Meng
  • Natesh Pillai
Abstract

Stochastic gradient descent procedures have gained popularity for parameter estimation from large data sets. However, their statistical properties are not well understood in theory, and in practice, avoiding numerical instability requires careful tuning of key parameters. Here, we introduce implicit stochastic gradient descent procedures, which involve parameter updates that are implicitly defined. Intuitively, implicit updates shrink standard stochastic gradient descent updates. The amount of shrinkage depends on the observed Fisher information matrix, which does not need to be explicitly computed; thus, implicit procedures increase stability without increasing the computational burden. Our theoretical analysis provides the first full characterization of the asymptotic behavior of both standard and implicit stochastic gradient descent-based estimators, including finite-sample error bounds. Importantly, analytical expressions for the variances of these stochastic gradient-based estimators reveal their exact loss of efficiency. We also develop new algorithms to compute implicit stochastic gradient descent-based estimators in practice for generalized linear models, Cox proportional hazards models, and M-estimators, and perform extensive experiments. Our results suggest that implicit stochastic gradient descent procedures are poised to become a workhorse for approximate inference from large data sets.
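
As a minimal illustration of the shrinkage idea described in the abstract, the sketch below assumes a linear model with squared-error loss, for which the implicit update theta_new = theta + gamma * (y - x.theta_new) * x can be solved in closed form. The function names, the learning-rate schedule gamma0 / n, and the simulated data are assumptions chosen for illustration; this is not the set of algorithms developed in the paper.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def explicit_sgd_step(theta, x, y, gamma):
    """Standard SGD step for squared-error loss (residual at the old theta)."""
    resid = y - dot(theta, x)
    return [t + gamma * resid * xi for t, xi in zip(theta, x)]


def implicit_sgd_step(theta, x, y, gamma):
    """Implicit step: solves theta_new = theta + gamma * (y - x.theta_new) * x.

    For the linear model the solution is the explicit step scaled by
    1 / (1 + gamma * ||x||^2), so no matrix is formed or inverted.
    """
    resid = y - dot(theta, x)
    shrink = 1.0 / (1.0 + gamma * dot(x, x))
    return [t + gamma * shrink * resid * xi for t, xi in zip(theta, x)]


def run_sgd(step, data, dim, gamma0):
    """One pass over the data with the assumed learning rate gamma0 / n."""
    theta = [0.0] * dim
    for n, (x, y) in enumerate(data, start=1):
        theta = step(theta, x, y, gamma0 / n)
    return theta


def simulate(n, true_theta, noise_sd, seed):
    """Hypothetical data: Gaussian covariates, linear response, Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in true_theta]
        y = dot(true_theta, x) + rng.gauss(0.0, noise_sd)
        data.append((x, y))
    return data


if __name__ == "__main__":
    true_theta = [1.0, -2.0]
    data = simulate(5000, true_theta, noise_sd=0.1, seed=0)
    for name, step in [("explicit", explicit_sgd_step),
                       ("implicit", implicit_sgd_step)]:
        print(name, run_sgd(step, data, dim=2, gamma0=10.0))
```

Running the script prints both estimates. With a large gamma0 the explicit iterates can swing widely in early iterations, whereas the implicit step stays bounded in this model because its effective step size, gamma / (1 + gamma * ||x||^2), never exceeds 1 / ||x||^2.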

Similar resources

Asymptotic Distributions of Estimators of Eigenvalues and Eigenfunctions in Functional Data

Functional data analysis is a relatively new and rapidly growing area of statistics. This is partly due to technological advancements which have made it possible to generate new types of data that are in the form of curves. Because the data are functions, they lie in function spaces, which are of infinite dimension. To analyse functional data, one way, which is widely used, is to employ princip...

Ridge Stochastic Restricted Estimators in Semiparametric Linear Measurement Error Models

In this article we consider stochastic restricted ridge estimation in semiparametric linear models when the covariates are measured with additive errors. The development of a penalized corrected likelihood method in such models is the basis for the derivation of ridge estimates. The asymptotic normality of the resulting estimates is established. Also, necessary and sufficient condition...

On Asymptotic Normality of the Local Polynomial Regression Estimator with Stochastic Bandwidths (arXiv:1409.0055v1 [math.ST], 29 Aug 2014)

Abstract. Nonparametric density and regression estimators commonly depend on a bandwidth. The asymptotic properties of these estimators have been widely studied when bandwidths are nonstochastic. In practice, however, in order to improve finite sample performance of these estimators, bandwidths are selected by data driven methods, such as cross-validation or plug-in procedures. As a result nonp...

Asymptotic Behaviors of Nearest Neighbor Kernel Density Estimator in Left-truncated Data

Kernel density estimators are the basic tools for density estimation in non-parametric statistics. The k-nearest neighbor kernel estimators represent a special form of kernel density estimators, in which the bandwidth is varied depending on the location of the sample points. In this paper, we initially introduce the k-nearest neighbor kernel density estimator in the random left-truncatio...

Wavelets for Nonparametric Stochastic Regression with Pairwise Negative Quadrant Dependent Random Variables

We propose a wavelet based stochastic regression function estimator for the estimation of the regression function for a sequence of pairwise negative quadrant dependent random variables with a common one-dimensional probability density function. Some asymptotic properties of the proposed estimator are investigated. It is found that the estimators have similar properties to their counterparts st...

Publication date: 2016